Performance Comparison of Statistical vs. Neural-Based Translation System on Low-Resource Languages
Abstract
One of the important applications for which natural language processing (NLP) is used is the machine translation (MT) system, which automatically converts one natural language to another. It has witnessed various paradigm shifts since its inception. Statistical machine translation (SMT) dominated MT research for decades. In the recent past, researchers have focused on developing MT systems based on artificial neural networks (ANN). In this paper, first, some deep learning models that are mostly exploited in Neural Machine Translation (NMT) design are discussed. A systematic comparison was done between the performances of SMT and NMT concerning English-to-Bangla and English-to-Hindi translation tasks. Most Indian scripts are morphologically rich, and the availability of a sufficient corpus is rare. We have presented and analyzed our work and a survey conducted on other low-resource languages, and finally some useful conclusions have been drawn.
Similar resources
Neural machine translation for low-resource languages
Neural machine translation (NMT) approaches have improved the state of the art in many machine translation settings over the last couple of years, but they require large amounts of training data to produce sensible output. We demonstrate that NMT can be used for low-resource languages as well, by introducing more local dependencies and using word alignments to learn sentence reordering during t...
Universal Neural Machine Translation for Extremely Low Resource Languages
In this paper, we propose a new universal machine translation approach focusing on languages with a limited amount of parallel data. Our proposed approach utilizes a transfer-learning approach to share lexical and sentence-level representations across multiple source languages into one target language. The lexical part is shared through a Universal Lexical Representation to support multilingual...
Multilingual Neural Machine Translation for Low Resource Languages
Neural Machine Translation (NMT) has been shown to be more effective in translation tasks compared to the Phrase-Based Statistical Machine Translation (PBMT). However, NMT systems are limited in translating low-resource languages (LRL), due to the fact that neural methods require a large amount of parallel data to learn effective mappings between languages. In this work we show how so-called mu...
Enabling Medical Translation for Low-Resource Languages
We present research towards bridging the language gap between migrant workers in Qatar and medical staff. In particular, we present the first steps towards the development of a real-world Hindi-English machine translation system for doctor-patient communication. As this is a low-resource language pair, especially for speech and for the medical domain, our initial focus has been on gathering suit...
Statistical Machine Translation in Low Resource Settings
My thesis will explore ways to improve the performance of statistical machine translation (SMT) in low resource conditions. Specifically, it aims to reduce the dependence of modern SMT systems on expensive parallel data. We define low resource settings as having only small amounts of parallel data available, which is the case for many language pairs. All current SMT models use parallel data dur...
Journal
Journal title: International Journal on Smart Sensing and Intelligent Systems
Year: 2023
ISSN: 1178-5608
DOI: https://doi.org/10.2478/ijssis-2023-0007